On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation

Authors

  • Gavin C. Cawley
  • Nicola L. C. Talbot
Abstract

Model selection strategies for machine learning algorithms typically involve the numerical optimisation of an appropriate model selection criterion, often based on an estimator of generalisation performance, such as k-fold cross-validation. The error of such an estimator can be broken down into bias and variance components. While unbiasedness is often cited as a beneficial quality of a model selection criterion, we demonstrate that a low variance is at least as important, as a non-negligible variance introduces the potential for over-fitting in model selection as well as in training the model. While this observation is in hindsight perhaps rather obvious, the degradation in performance due to over-fitting the model selection criterion can be surprisingly large, an observation that appears to have received little attention in the machine learning literature to date. In this paper, we show that the effects of this form of over-fitting are often of comparable magnitude to differences in performance between learning algorithms, and thus cannot be ignored in empirical evaluation. Furthermore, we show that some common performance evaluation practices are susceptible to a form of selection bias as a result of this form of over-fitting and hence are unreliable. We discuss methods to avoid over-fitting in model selection and subsequent selection bias in performance evaluation, which we hope will be incorporated into best practice. While this study concentrates on cross-validation-based model selection, the findings are quite general and apply to any model selection practice involving the optimisation of a model selection criterion evaluated over a finite sample of data, including maximisation of the Bayesian evidence and optimisation of performance bounds.
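
The selection bias described in the abstract can be illustrated with a small simulation. This is a minimal sketch and is not taken from the paper: the function name, the parameter values, and the set-up are all assumptions chosen for illustration. Every candidate model is, by construction, a coin-flip classifier with a true error rate of 0.5, so any apparent advantage of the selected candidate comes purely from the variance of a criterion evaluated over a finite sample.

```python
import random


def selection_bias_demo(n_candidates=50, n_validation=100, n_trials=200, seed=0):
    """Compare the true error of equally bad candidates with the validation
    error of whichever candidate happens to look best (hypothetical set-up)."""
    rng = random.Random(seed)
    true_error = 0.5  # assumption: every candidate is a coin-flip classifier
    best_errors = []
    for _ in range(n_trials):
        # Validation error of each candidate: fraction of mistakes on a
        # finite validation sample of size n_validation.
        errors = [
            sum(rng.random() < true_error for _ in range(n_validation)) / n_validation
            for _ in range(n_candidates)
        ]
        # Model selection: keep the candidate with the lowest validation error.
        best_errors.append(min(errors))
    mean_selected = sum(best_errors) / n_trials
    return true_error, mean_selected


true_error, mean_selected = selection_bias_demo()
print(f"true error of every candidate: {true_error:.3f}")
print(f"mean validation error of the selected candidate: {mean_selected:.3f}")
```

In this sketch the minimised validation error falls well below 0.5 even though no candidate is better than chance, which is why the value of the optimised criterion should not be reused as the performance estimate. Increasing `n_validation` lowers the variance of the criterion and shrinks the gap; increasing `n_candidates` widens it.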

Journal:
  • Journal of Machine Learning Research

Volume 11

Publication date: 2010